Abstract

In this paper, we empirically evaluate the quality of several load indices in the
context of dynamic load balancing. We have implemented a load balancer for Sun/UNIX‡
environments. In our experimental setup, six Sun-2 workstations were driven by job
scripts, and job response times were measured while loads were being balanced and
various load indices were used to make job placement decisions. We study the effects on
performance of the choice of load index, the averaging interval, the load information
exchange period, and the characteristics of the workload. Measurements show that the
performance benefits of load balancing are indeed strongly dependent upon the load index.
Load indices based on resource queue lengths are found to perform better than those based
on resource utilization, and the use of an exponential smoothing method yields further
improvement over that of instantaneous queue lengths.


† This work was partially sponsored by the Defense Advanced Research Projects Agency (DoD), Arpa
Order No. 4871, monitored by Space and Naval Warfare Systems Command under Contract No.
N0003S-84-C-0089, and by the National Science Foundation under grant DMC-8603575. The views
and conclusions contained in this document are those of the authors and should not be interpreted as
representing official policies, either expressed or implied, of the Defense Research Projects Agency or
of the US Government.

‡ UNIX is a trademark of AT&T Bell Laboratories.
1. INTRODUCTION

In a loosely-coupled distributed system, the potential for resource sharing and its
possible rewards are substantial. Two frequently cited advantages of resource sharing are
the larger number of accessible resources, in terms of both type and quantity, and the
higher reliability that may result from the multiplicity of available resources. In order to
share these resources effectively, some measure of the loads being imposed on the resources
has to be made available to the clients. The information about resource loads is part of
the system’s state, and is among the most rapidly changing aspects of it. Since the loads
are likely to be changing all the time, load information tends to become stale rapidly. To
quantify the concept of load, we use a load index, which preferably is a non-negative
variable taking on a zero value if the resource is idle, and increasing positive values as the
load increases. This paper is concerned with the quality of the possible load indices for
hosts in a particular but important application of load indices, that of load balancing in
distributed systems.

A job arriving at a host will very likely demand services from a number of resources
(e.g., CPU and disks). Hence, it is important to define not only the load on a single
resource in a host, but also that of the host viewed as a collection of resources. Since the
resource consumption patterns of the jobs are likely to be different, it may not be
meaningful to talk about “the load” of the host. For example, the CPU may be heavily
congested, while the disks are not. In this case, to an incoming CPU-bound job the host’s load
is very high, whereas to an incoming I/O-bound job the host’s load is low because it will
not experience much queueing at the disks. This observation is formalized in [Ferrari86],
where a job type-dependent load index based on the resource queue lengths is proposed
and experimentally evaluated.

Load information is important since it can serve as the basis of the efforts to improve
the system’s performance by redistributing the loads. It is frequently observed that, in a
distributed system, the loads of the hosts are not evenly distributed all the time. Livny
and Melman pointed out that, for a queueing system consisting of multiple homogeneous
service centers with Poisson arrivals of identical rates, the probability of some hosts being
idle while some others have more than one job can be very significant; hence,
redistributing the workload among the resources has the potential of improving performance
[Livny82].

In order to evaluate the quality of a load index for load balancing, we specify a
number of criteria, or desirable properties. These criteria, in turn, are dependent on the
objective of load balancing, i.e., the performance index that is to be optimized by
balancing the loads. In this research, we are mostly concerned with interactive computing
environments, where the job response time and its predictability are very important
measures of system performance. Therefore, we use the mean job response time as our
performance index, supplemented by the standard deviation of the response times. A good load
index should:

1) be able to reflect our qualitative estimates of the current load on a host;

2) be usable to predict the load in the near future, since the response time of a job will
be affected more by the future load than by the present load;
where N is the total number of resources for which there is queueing in the host. [...]

The load index introduced in [Ferrari86] is response time oriented, and job dependent.
Instead of a unique value at a particular moment in time, the load of a host differs for
different jobs because of their varying resource demands, which are assumed to be known.
This assumption enables us to predict the response time of a job more accurately, and
hence to make better load balancing decisions. However, while we have found relationships
between the arguments of a job and the job’s resource demands [Zhou87c], the assumption
that the demands of a job are known in advance may be too strong in many cases. In this
study, we investigate versions of the same load index in which the coefficients of the
resource queue lengths are job independent, and only reflect the relative importance of
the resources (with respect to a “basket” of jobs). For example, we can use unity as the
coefficients to reduce the linear combination to the sum of the resource queue lengths,
that is, in queueing modeling terms, “the number of jobs (or processes) in the system.”
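
As an illustration of the job-independent index just described, the following is a minimal sketch of a linear combination of resource queue lengths that reduces to their plain sum when all coefficients are unity. The function name, the resource names, and the sample values are hypothetical and are not taken from the measured system.

```python
from typing import Mapping, Optional


def load_index(queue_lengths: Mapping[str, float],
               coefficients: Optional[Mapping[str, float]] = None) -> float:
    """Linear combination of resource queue lengths.

    With no coefficients given, every resource has weight 1, so the index
    is simply the sum of the queue lengths.
    """
    if coefficients is None:
        return float(sum(queue_lengths.values()))
    # Resources without an explicit coefficient default to weight 1 (assumption).
    return float(sum(coefficients.get(name, 1.0) * length
                     for name, length in queue_lengths.items()))


# Hypothetical snapshot of one host's queues.
queues = {"cpu": 3.0, "io": 1.0, "memory": 0.0}
unit_index = load_index(queues)
weighted_index = load_index(queues, {"cpu": 1.0, "io": 0.5, "memory": 2.0})
```

An idle host (all queues empty) yields an index of zero, and the index grows as queues build up, which matches the properties asked of a load index in the introduction.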

Our extensive measurements of production time-sharing workload show that the
system load is changing quite rapidly [Zhou87b]. On top of a low-frequency main component,
there are a number of high-frequency load components that may be regarded as “noise”
rather than useful information. Using the instantaneous resource queue lengths may give
excessive importance to such noise and lead to bad job transfer decisions. We used a
smoothing algorithm to compute the time-averaged queue length and compared load
balancing performance using smoothed queue lengths to that of the same scheme using
instantaneous queue lengths.


3.  SYSTEM  AND  WORKLOAD 

We now describe the environment in which the measurements
were taken, and the workloads used to drive the system.

System 

We have implemented a dynamic load balancer for Sun/UNIX environments. The
structure of the load balancer is shown in Figure 1†. The UNIX user interface program, csh, is
modified so that the commands typed in by the user are intercepted, and some of them
are transferred to some remote host for execution when the local host is heavily loaded‡.
At startup time, the C-shell reads in a configuration file that specifies a list of job types

† To distinguish our modified C shell from the standard one [Joy80], we call it C-shell [...]

‡ [...] Berkeley [...] the Berkeley UNIX 4.3 BSD running on VAX machines [Joy83, McKusick85].
home C-shell and are terminated when the home C-shell exits. This scheme has the
potential problem of R-shell proliferation. However, the code segments of all C-shells and
R-shells on each host are shared, so that, when an R-shell is not active, almost no
resources are consumed. Since files are retrieved from file servers and the workstations are
diskless, only the command line needs to be shipped, and the cost of file access is
essentially the same from all hosts.

Load balancing algorithms have a strong influence on performance. We implemented
and studied a number of algorithms using different methods for load information exchange
and job placement [Zhou87a]. For this study of load indices, however, we just selected
one of the algorithms, that is, the one called GLOBAL. For every time period P, the LIM
on each host extracts load information from the local kernel to compute the local host’s
load index. If the new value of the load index is significantly different from the previous
one, it is sent to the master LIM, which collects load information from every host and
broadcasts the entire load vector in each period P. When a job whose name is on the
eligibility list is submitted to a host, the local LIM is contacted for job placement. If the
local load is high, the host perceived by the local LIM to have the least load is selected,
and the job is sent there.
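
The information flow of GLOBAL described above can be sketched roughly as follows. This is a single-process illustration under stated assumptions, not the implementation measured in this study: the class and method names, the change threshold of 0.5, the "high load" threshold of 2.0, and the host and job names are all hypothetical, and message passing is replaced by direct method calls.

```python
from typing import Dict, Optional, Set


class MasterLIM:
    """Collects load indices from every host and hands out the whole vector."""

    def __init__(self) -> None:
        self.load_vector: Dict[str, float] = {}

    def report(self, host: str, load: float) -> None:
        self.load_vector[host] = load

    def broadcast(self) -> Dict[str, float]:
        # Stands in for the periodic broadcast (once per period P).
        return dict(self.load_vector)


class LocalLIM:
    def __init__(self, host: str, master: MasterLIM,
                 change_threshold: float = 0.5, high_load: float = 2.0) -> None:
        self.host = host
        self.master = master
        self.change_threshold = change_threshold  # assumed meaning of "significantly different"
        self.high_load = high_load                # assumed meaning of "local load is high"
        self.local_load = 0.0
        self.last_sent: Optional[float] = None
        self.known_loads: Dict[str, float] = {}

    def periodic_update(self, new_load: float) -> None:
        """Called once per period P with the freshly computed local index."""
        self.local_load = new_load
        if (self.last_sent is None
                or abs(new_load - self.last_sent) >= self.change_threshold):
            self.master.report(self.host, new_load)
            self.last_sent = new_load

    def receive_broadcast(self, vector: Dict[str, float]) -> None:
        self.known_loads = dict(vector)

    def place(self, job_name: str, eligible: Set[str]) -> str:
        """Return the host on which the job should run."""
        if job_name not in eligible or self.local_load < self.high_load:
            return self.host
        candidates = dict(self.known_loads)
        candidates[self.host] = self.local_load
        # Pick the host that this LIM believes to be least loaded.
        return min(candidates, key=lambda h: candidates[h])


master = MasterLIM()
lims = {name: LocalLIM(name, master) for name in ("a", "b", "c")}
for name, load in {"a": 3.0, "b": 0.5, "c": 1.5}.items():
    lims[name].periodic_update(load)
vector = master.broadcast()
for lim in lims.values():
    lim.receive_broadcast(vector)

eligible = {"troff"}  # hypothetical eligibility list
decision = lims["a"].place("troff", eligible)
```

Note that placement relies only on the most recently broadcast vector, which is why the exchange period P and the staleness of load information matter for the quality of the decisions.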

The implementation described above provides a transparent, low-cost, and
general-purpose load balancer whose installation requires no changes to the kernel† or to
user programs. Since the emphasis of this paper is on load indices, we will not describe
the design and implementation issues in more detail. The interested reader is referred to
[Zhou87a].

Workload 

Workload characterization and selection are crucial to a measurement study.
Although artificial workloads considerably increase the repeatability of experiments, they
ought to represent natural workloads reasonably well, so as to strengthen our confidence
in the results. We traced a production VAX-11/780 machine running under the Berkeley
UNIX operating system for an extended period of several months, and
analyzed the types and frequencies of the commands executed by the system. On the
basis of such an analysis, we selected 30 frequently executed commands, listed in Table 1,
and used them to construct job scripts, i.e., sequences of commands.

To obtain various levels, or intensities, of a host’s load, we ran a variable number of
jobs in the background. The artificial workloads were not intended to represent the
typical workloads of personal workstations, but rather those of small (i.e., not very powerful)
time-sharing systems. Workstations were used because of their being available in our
distributed systems laboratory. We simulated user think times by the “sleep” command.
Job scripts are classified into three levels: light (L), moderate (M), and heavy (H), with a
number of workloads constructed for each level, so that hosts subjected to the same [...]
interval (CI) of the values of the performance indices over these replications.
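
For readers who want to reproduce this kind of summary, the following is a minimal sketch of computing the mean and a 90% confidence interval of a performance index over a few replications. It assumes a Student-t interval for the mean, which the text does not state explicitly; the function name and the sample response times are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple

# Student-t critical values for a two-sided 90% interval (0.95 quantile),
# indexed by degrees of freedom 1..9.
T_CRIT_90 = {1: 6.314, 2: 2.920, 3: 2.353, 4: 2.132, 5: 2.015,
             6: 1.943, 7: 1.895, 8: 1.860, 9: 1.833}


def confidence_interval_90(samples: Sequence[float]) -> Tuple[float, float]:
    """Return (low, high) of a 90% t-interval for the mean of the samples."""
    n = len(samples)
    if not 2 <= n <= 10:
        raise ValueError("this sketch only tabulates 2 to 10 replications")
    centre = mean(samples)
    # Half-width = t * s / sqrt(n), with s the sample standard deviation.
    half_width = T_CRIT_90[n - 1] * stdev(samples) / math.sqrt(n)
    return centre - half_width, centre + half_width


# Hypothetical mean response times (seconds) from five replications.
replications = [10.0, 12.0, 11.0, 13.0, 9.0]
low, high = confidence_interval_90(replications)
```

Two configurations whose intervals do not overlap can be regarded as different at roughly this confidence level; overlapping intervals call for more replications before drawing a conclusion.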

4.  DESIGN  AND  RESULTS  OF  THE  EXPERIMENTS 
Experimental  Factors 

Four  factors  were  identified  to  be  of  interest  in  the  study  of  load  indices: 

1) Load index. We used as load indices the following quantities: the instantaneous
CPU queue length; exponentially averaged CPU queue length; the sum of averaged
CPU, file and paging/swapping I/O, and memory queue lengths†; and the average
CPU utilization over a recent period. Inside the kernel, we kept variables for the
queue length of each resource type. The length of each queue was sampled every 10
ms by the clock interrupt routine, and used to compute the one-second average
queue length, q_i. Exponential smoothing was used to compute the average queue
length over the last T seconds:

Q_i = Q_{i-1} e^{-1/T} + q_i (1 - e^{-1/T}),   i \ge 1

Q_0 = 0

2) Averaging interval T. For exponentially smoothed values of a resource queue
length, and for the average CPU utilization, the interval T over which the average is
computed conceivably affects the quality of the index, and hence the system’s
performance.

3) Workload. There may be interactions between the load index chosen and the
workload the system is subjected to. Using the three suites of host workloads described in
the previous section, we were able to construct several combinations of system
workload for the six workstations in our system. The canonical workload consisted of two
heavy, two moderate, and two light scripts (2H, 2M, 2L). We also studied the indices
under a more balanced workload, with all six workstations driven by moderate
scripts (6M).

4) Exchange interval P. The GLOBAL algorithm employs periodic updates of load
information. If P is too short, the overhead may be too high; if P is too long,
job placements are based on stale information, so that performance may
deteriorate and system instability may result.
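
The exponential smoothing of one-second queue length samples mentioned under factor 1 can be sketched as follows. The sketch assumes the common recurrence Q_i = Q_{i-1} e^{-1/T} + q_i (1 - e^{-1/T}) with Q_0 = 0; the decay factor, the function names, and the synthetic input are assumptions made for illustration.

```python
import math
from typing import Iterable, List


def smooth(samples: Iterable[float], T: float) -> List[float]:
    """Exponentially smooth one-second samples over roughly the last T seconds."""
    if T <= 0:
        raise ValueError("T must be positive")
    decay = math.exp(-1.0 / T)   # weight kept from the previous average
    average = 0.0                # Q_0 = 0
    smoothed = []
    for q in samples:
        average = average * decay + q * (1.0 - decay)
        smoothed.append(average)
    return smoothed


def spread(values: List[float]) -> float:
    """Peak-to-peak variation, used here as a crude measure of noise."""
    return max(values) - min(values)


# Synthetic queue length: a steady level of 2 with high-frequency noise of +/-1.
noisy = [2.0 + (1.0 if i % 2 == 0 else -1.0) for i in range(60)]
smoothed = smooth(noisy, T=4.0)
```

A larger T suppresses more of the high-frequency component but also makes the index slower to follow a genuine change in load, which is the trade-off behind factor 2.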

Measurement  Results 

We shall first study the indices and the averaging interval T by fixing the workload
at its canonical level, and the exchange interval at 10 seconds. We will then use the more
balanced workload 6M to examine the interactions between load indices and workload.
Finally, we will study the effect of load exchange interval P on performance.


† For simplicity, we treated the disk queues as a single aggregate queue for I/O operations. For the
memory queue, we identified a number of places inside the kernel where processes queue up for
various types of memory resources (e.g., buffer space, page table), and treated all these as a single
Comparing the queue-length-based indices with each other, we notice that the
exponentially smoothed indices can perform best, but, if the averaging period T is too
long (e.g., > 20 s), performance may even become worse. Earlier in this paper, we have
pointed out that, by averaging the queue lengths, the adverse effect of the high-frequency
“noise” in the load can be reduced. This is reflected by improved performance. However,
since the system load is changing all the time, averaging over too long a period will
emphasize too much the past loads, which have little correlation with the future ones.
The optimum averaging interval is clearly dependent upon the dynamics of the workload:
the faster the load changes, the shorter the interval should be. In a measurement study of
production workloads on a VAX-11/780 running Berkeley UNIX 4.2BSD [Zhou87b], we
found that the average net change in CPU queue length in 30 seconds was 2.31, when the
average CPU queue length itself was 4.12, i.e., a change of about 56% of the mean. This
suggests that T should be substantially shorter than 30 seconds.

The performance difference between the cases in which indices based on CPU queue
alone are used, and those in which indices consider I/O and memory contention also, is
not significant, suggesting that the CPU is the predominant resource in our hosts. We
found that the I/O and memory queue lengths were generally much shorter than that of
the CPU; that is, the former are much less contended for. It should be pointed out that our
systems support general computing in a research environment; with other types of
workload, e.g., database-oriented ones, the contention profile of the various resource types may
be substantially different. However, to achieve near-optimal performance, we do not have
to consider all the resources in the system, but rather only those with significant
contention. We also studied more general forms of linear combinations of queue lengths by using
coefficients other than unity, but no significant changes in performance were observed.
This, again, is probably due to the dominating influence of the CPU queue.

The load average shown in Table 3 is an index provided by a UNIX command; it is
the exponentially smoothed number of processes ready to run, or running, or waiting for
some high-priority event (e.g., disk I/O completion). A number of load balancers
constructed in the past in the UNIX environment have used the load average as their load
index (e.g., [Bershad85]). This research shows that significant further improvement can be
obtained by using indices that more accurately reflect the current queueing at the
resources.

The performance produced by the indices under the more balanced workload 6M is
shown in Table 4. Since the workload is now more balanced and moderate, the amount of
improvement in response time is not as much as that under the canonical workload;
however, the relative rankings of the indices are quite similar. This suggests that the above
analyses of the qualities of the indices and the appropriate values for T remain valid
under a more balanced, moderate workload. It is worth noting that, in this case, due to
the smaller improvement, using a poor load index (e.g., load average or 00 s CPU
utilization) may yield little or no performance improvement.

Finally, we study the influence of the load exchange period P. Figure 2 shows the
mean job response time as a function of P, with the other three factors fixed. The
brackets around the data points show their 90% confidence intervals. When the exchange
period P is very short, the load information used in job placements is generally up to
date, but this positive influence is outweighed by high message overhead. Conversely, if P
is too long, the information may get stale, the quality of job placements deteriorates, and
Figure 2. Mean process response time under various load exchange periods P
(Canonical workload, load index 4 s CPU+I/O+Mem ql).


criteria  reasonably  well:  the  queue  length  is  an  accurate  measure  of  a  resource’s  load,  and 
smoothing  over  a  short  interval  into  the  past  gives  predictive  capabilities  to  the  value  of 
the  index,  as  well  as  stability  against  the  noise  in  the  load  waveform.  Queue-length-based 
load  indices  also  appear  to  be  more  adaptable  to  a  heterogeneous  environment,  but  more 
studies  are  needed  to  substantiate  this  conjecture. 

Our results support indices compatible with the one proposed in [Ferrari86], as they
can be seen as degenerate forms of that index. However, the comparisons performed in
this study are far from being complete. We decided to use the same load balancing
algorithm for all the indices, so that the qualities of the load indices may be directly
comparable. On the other hand, the algorithm limited the varieties of load indices that could be
studied. We demonstrated, using a particular set of workloads and in a particular
computing environment, that linear combinations of resource queue lengths may be good load
indices. No proof, however, is offered that they are the best.



Systems,  pp.  54-69,  May  1986. 

[Livny82]

M. Livny and M. Melman, “Load Balancing in Homogeneous Broadcast Distributed
Systems,” Proc. ACM Computer Network Performance Symposium, pp. 47-55, April 1982.

[McKusick85]

K. McKusick, M. Karels, and S. Leffler, “Performance Improvements and Functional
Enhancements in 4.3 BSD,” Proc. Summer USENIX Conference, June 1985, Portland, OR,
pp. 519-531.

[Wang85]

Y. Wang and R. Morris, “Load Balancing in Distributed Systems,” IEEE Trans. Comp.
Vol. C-34, No. 3, pp. 204-217, March 1985.

[Zhou86]

S. Zhou, “A Trace-Driven Simulation Study of Dynamic Load Balancing,” Tech. Rept. No.
UCB/CSD 87/305, September 1986, also submitted for publication.

[Zhou87a]

S. Zhou and D. Ferrari, “An Experimental Study of Load Balancing Performance,” Tech.
Rept. No. UCB/CSD 87/336, January 1987, also submitted for publication.

[Zhou87b]

S. Zhou, “An Experimental Assessment of Resource Queue Length as Load Indices,” Proc.
Winter USENIX Conference, Washington, D.C., pp. 73-82, January 21-24, 1987.

[Zhou87c]

S. Zhou, “Predicting Job Resource Demands: a Case Study in Berkeley UNIX,” in
preparation.


